\chapter{Introduction}
\begin{quote}
Information is physical.\\
--Rolf Landauer
\end{quote}
\section{Quantum Information}
Quantum mechanics is not only a theory of physical
systems -- of subatomic particles,
atoms, molecules, crystalline solids and so on -- it is also, and
perhaps primarily, a theory
of information.  Atoms and molecules, when static,
store information.  When engaged in physical processes they can be
thought of as performing a computation.  Moreover, it is possible to abstract away the
information-theoretic part of quantum mechanics from its embodiment in
particular physical systems, to talk of quantum bits and unitary logic
gates instead of atoms and Hamiltonians.  Some researchers even hope that
the postulates of quantum mechanics can be entirely replaced by purely
information-theoretic statements about the fundamental limits
placed by Nature on our ability to encode, decode and transmit
information\cite{Fuchs2002}. 

This reinvention of quantum mechanics as a theory of information has
led to some startling discoveries in the past twenty years.  It has
revealed that computers that store and manipulate
quantum systems can, for certain problems, use exponentially fewer computational resources than ordinary
computers\cite{Shor1997}, even in the presence of errors\cite{diVincenzo1996}.  It
has shown us that information stored in quantum systems, unlike
ordinary information, cannot be copied.  It can, however, be `teleported'
from one system to another, through processes that destroy the
information in the original system and transfer it to the other\cite{Bennett1993}.
Furthermore, the peculiar way that quantum states can be
perfectly correlated while being perfectly random permits completely
secure communications,
something that is viewed as
impossible in classical information theory\cite{Bennett1992}.   

At the same time, our ability to manipulate, control and measure
all aspects of quantum systems has improved dramatically.  Single atoms can now be
reliably trapped, manipulated and made to interact\cite{Itano1987}.  Single
photons can be
generated at will, either alone or in highly correlated multi-photon
states\cite{Kwiat1995}.  Single atoms can be made to interact with single photons with
exquisite control\cite{McKeever2003}.  Large, many-body systems can be created and
made to interact coherently with a high degree of tunability in the system parameters\cite{Bloch2008}.  These advances have
permitted the development of an experimental branch of quantum
information science and have led to a great
many proof-of-concept demonstrations of the main ideas of quantum
information theory and foundational quantum mechanics.  These landmark experiments include the first
unambiguous violation of Bell's inequalities\cite{Aspect1981}, the
first demonstration of secure quantum
cryptography\cite{Bennett1992}, the first demonstrations of
teleportation\cite{Bouwmeester1997,Nielsen1998} and the demonstration of quantum
logic gates\cite{OBrien2004} and simple quantum
algorithms\cite{Chuang1998_2,Lanyon2007}.  While these experiments have
given us concrete examples of quantum information at work, they have
also created a need for a fuller understanding
of how best to connect the information gleaned from quantum measurements to
a full description of the quantum state.  It is the main aim of this
thesis to contribute to this important area of research.

The interaction between a
physical system and the information it contains is mediated via
measurement.  Measurement has always been a deep concern in quantum
mechanics, assuming a leading role in Heisenberg's uncertainty
principle, the postulates of quantum mechanics and important thought
experiments such as the EPR experiment\cite{EPR}, Bell's
inequalities\cite{Bell1964} and the Kochen-Specker
theorem\cite{Kochen-Specker} which formalize the
most irreconcilable differences between the classical and quantum
world.  In quantum mechanics, measurement
is disconnected from the underlying state of the
system in a way that it is not in classical mechanics.  As Niels Bohr put
it,
\begin{quote}
On the one hand, the definition of a physical system, as
ordinarily understood, claims the elimination of all external
disturbances.  But in that case, according to the quantum
postulate, any observation will be impossible, and, above all,
the concepts of space and time lose their immediate sense.  On
the other hand, if in order to make observation possible we
permit certain interactions with suitable agencies of
measurement, not belonging to the system, an unambiguous
definition of the state of the system is naturally no longer
possible, and there can be no question of causality in the
ordinary sense of the word.  The very nature of the quantum
theory thus forces us to regard the space-time coordination and
the claim of causality, the union of which characterizes the
classical theories, as complementary but exclusive features of
the description, symbolizing the idealization of observation and
definition respectively\cite{Bohr1928}.
\end{quote}

Despite this impossibility of simultaneously knowing everything about
a given quantum system, quantum systems can, nevertheless, be
fully characterized if enough identical copies are made of them.  This
characterization is conceptually similar to the way that a large
number of systems obeying classical mechanics with unknown
individual properties can be characterized by a
probability distribution.  In fact, it is becoming an increasingly
popular viewpoint among foundational researchers that, at its heart,
quantum mechanics is a statistical mechanical theory of
\emph{something}\footnote{This is one way of describing the so-called
  \emph{epistemic} view of quantum states.  This viewpoint holds that
  quantum states are fundamentally states of knowledge rather than
  states of reality.}, although what that
something is remains elusive\cite{Fuchs2002, Spekkens2007}.  

While this thesis does
not provide any answers that will revolutionize foundational quantum
mechanics, it does take seriously the notion that developing
methods for characterizing quantum states made in the lab can deliver
insight into what quantum states really \emph{are}.  To this end we
attempt to uncover the underlying symmetries in the description of
quantum states.  We look at how this description changes when
individual particles become distinguishable or
indistinguishable to experimental measurements.  Finally, we ask
what information it is possible to gain from a state directly, without
requiring that \emph{all} the information about a state be available.
These investigations into the art and practice of quantum state
estimation form the core of this thesis.    

The remainder of this chapter will introduce some fundamental
concepts that are essential to understanding the original work
presented in this thesis.  The basic theoretical and experimental
methodology employed in state creation and
state characterization will be laid out in Chapter 2.  After
that we will begin examining how quantum
state estimation can be extended, improved and understood in a variety
of contexts.  

While in some quantum systems, estimating the quantum state is
straightforward, in others, the effects of
interactions, indistinguishability, and unaccounted degrees of freedom
can make the question `What is the quantum state of my system?' an
extremely thorny one to even phrase in terms of clear experimental
observables, let alone answer.  Chapter 3 of this thesis examines
just such a situation
where the indistinguishability of photons makes their quantum state
impossible to describe using the standard methods of quantum state
estimation which rely on the treatment of each particle as a
distinct entity.  In contrast, by concentrating on the
observables which it \emph{is} possible to measure, we arrive at a complete,
elegant and scalable method of characterizing these
states.

Chapter 4 examines how to structure a
set of measurements so as to maximize the information extracted with
each one, leading to optimal state estimation
on a fixed number of copies of the state.  This
optimality is deeply connected to the geometry of the Hilbert
space of measurements and helps to relate that geometry to
operationally relevant parameters.  Additionally, this
optimal set of measurements is intimately tied to a description of
quantum states on a discrete
phase-space instead of a Hilbert space.  This new description has
several useful and intuitive
properties that can provide insight into the structure of quantum
correlations.  

Chapter 5 looks at how some
properties of states that are usually thought to require a lengthy and
complete characterization of the state can be obtained directly through
a judicious choice of measurements.  This will be an important
technology as quantum systems become larger and the exponential scaling of
the complexity of state estimation with the size of the system makes full state
characterization technically infeasible. 

\section{Concepts}
\subsection{The density matrix}
While one often uses state vectors to describe
the quantum state of a system in terms of its wavefunction, in
experimental work (and in many other situations) it is usually
preferable to describe the state using the more general density
matrix formulation.  The density matrix or density operator is a
linear, Hermitian operator on the Hilbert space of wavefunctions.  One
way to think of it is as a probability distribution over
projectors onto different wavefunctions.  That is to say
\begin{equation}
\rho=\sum_i p_i \ket{\psi_i}\bra{\psi_i}
\end{equation}
where the $p_i$ are probabilities, i.e. real numbers on $\left[0,1\right]$ such that
$\sum_i p_i=1$.  The non-negativity of the $p_i$ implies that the
eigenvalues of $\rho$ are non-negative, which in turn requires that
for any row $x$ and column $y$, 
$|\rho_{xy}|=|\rho_{yx}|\leq\sqrt{\rho_{xx}\rho_{yy}}$.  The
off-diagonal elements for which $x\neq y$ are called
\emph{coherences} while the diagonal elements are called
\emph{populations}.  

While the density matrix can be rotated to any basis by applying
unitary operations, we usually choose to express $\rho$ in a preferred
basis called the \emph{computational basis}.
The computational basis has the property that all of the basis
states are separable, that is to say they can be written as a
tensor product of single-particle states.  This is not true for
all possible density matrices as will be seen in the section on entanglement.

When the magnitudes of all coherences take on their maximum values, i.e. when
$|\rho_{yx}|=\sqrt{\rho_{xx}\rho_{yy}}$ for all $x$ and $y$,
the state is said to be pure.  Only one of the $p_i$ is non-zero,
and the density matrix is a projector onto a single
state-vector.  

If all of the coherences are zero for a density
matrix written in the computational basis, the state
can be thought of as a classical mixture and its measurement statistics are
governed solely by ordinary probability theory applied to the
single-particle measurement outcomes.  

A \emph{maximally-mixed
  state} is a density matrix proportional to the identity
operator with all of its coherences zero and all of its
populations equal.  

The purity\cite{MikeandIke} is a useful measure of whether the
state behaves more like a classical statistical mixture or
more like a pure quantum state.  The purity is defined as
\begin{equation}
P=\text{Tr}\left[\rho^2\right].
\end{equation}
This measure is invariant under unitary operations.
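
As a minimal illustration (the one-parameter family below is chosen purely
for concreteness and is not used elsewhere in this thesis), consider a
two-level system with equal populations and a real parameter $v$,
$0\leq v\leq 1$, setting the size of the coherences:
\begin{align*}
\rho(v)&=\frac{1}{2}\left(\begin{array}{cc}1 & v\\ v & 1\\ \end{array}\right), &
P&=\text{Tr}\left[\rho(v)^2\right]=\frac{1+v^2}{2}.
\end{align*}
For $v=1$ the coherences reach their maximum magnitude
$\sqrt{\rho_{xx}\rho_{yy}}=1/2$ and $P=1$, so the state is pure.  For $v=0$
the state is maximally mixed and $P=1/2$.  More generally, for a
$d$-dimensional system the purity ranges from $1/d$ for the
maximally-mixed state to $1$ for a pure state.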

If $\hat{O}$ is a Hermitian operator on the Hilbert space of
states then the expectation value of the operator for a system
in the state $\rho$ is given by
\begin{equation}
\left<\hat{O}\right>=\text{Tr}\left[ \rho \hat{O}\right].
\label{eq:expectation_values}
\end{equation}

The advantage of the density matrix description is that it is
capable of describing both the statistics of quantum states and
quantum measurement and the classical statistics induced by
experimental randomness.  This makes it the most appropriate
description for experimentally generated states.  The density matrix also
has the property that, because it is a Hermitian operator, it is
itself an observable.  This makes it possible
to reconstruct the density matrix from measurements.   

Like a probability distribution, a density matrix is a
statistical description of a quantum state.  Depending on one's
preferred philosophy one can view it as being `really' a
description of the frequency of observation of certain
measurement outcomes, a state of knowledge about the outcome of
such measurements, or a description of reality for a subsystem of
a larger, pure, system.  These interpretational differences
make identical predictions for the outcome of experiments, of course, but one or the
other may be more convenient in understanding the density matrix
in a particular context. 

\subsection{Qubits}
One of the deepest insights of classical computer science is the 
realization that all information-carrying
systems are formally equivalent to a binary system of ones and
zeros called bits.  This insight allowed the theory
of redundancy, data compression and
informational entropy to be formulated completely independent of the physical
system carrying the information.  

Some of the earliest results in quantum information theory have
to do with the analogue of data compression\cite{Schumacher1994} for quantum
information.  It was in developing this theory that Schumacher first coined
the term qubit as a contraction of quantum bit to describe the
smallest quantity of \emph{quantum} information\cite{MikeandIke}.  Ever since, the qubit has
been used as an abstraction capable of distilling the information-theoretic
essence from a given physical
quantum system.

A qubit is a quantum two-level system described by a state vector in a
two-dimensional Hilbert space.  In the quantum information
literature it is typical to take the basis states to be
$\ket{0}$ and $\ket{1}$.  The experiments discussed in this
dissertation all involve a particular implementation of the
qubit, namely the polarization state of a single photon.
Since this is the only type of qubit we will be discussing, we
adopt the conventional notation of using
$\ket{H}$ and $\ket{V}$, the horizontal and vertical
polarization states as the basis states for our
qubit\footnote{The treatment of polarization as a vector in a two-dimensional
Hilbert space dates back to \emph{Jones calculus} invented by
R. C. Jones in 1941\cite{Jones1941}.}.  The beauty of the qubit
concept, though, is that any information-theoretic development
realized in one physical system like polarization is immediately
applicable to all other physical qubits like the spin state of
a trapped ion, the direction of current in a superconducting loop, the
magnetic moment of a hydrogen atom in nuclear magnetic
resonance, the excitation state of a neutral atom in a shallow
potential lattice or the spin state of an electron trapped on a quantum dot.

It is useful to label particular superpositions of the $\ket{H}$
and $\ket{V}$ basis states which will come up frequently in work with polarization.  Following the
conventions for Jones vectors\cite{Jones1941} (and as used in \cite{James2001}), we define 
\begin{align}
\text{Diagonal}\; \ket{D} &\equiv \frac{1}{\sqrt{2}}\left(\ket{H}+\ket{V}\right)\\
\text{Anti-diagonal}\;\ket{A} &\equiv \frac{1}{\sqrt{2}}\left(\ket{H}-\ket{V}\right)\\
\text{Left circular}\;\ket{L} &\equiv \frac{1}{\sqrt{2}}\left(\ket{H}+i\ket{V}\right)\\
\text{Right circular}\;\ket{R} &\equiv \frac{1}{\sqrt{2}}\left(\ket{H}-i\ket{V}\right).
\end{align}
Sometimes in the quantum information literature $\ket{L}$ and
$\ket{R}$ will have opposite signs to those given here.  

Along with $\ket{H}$ and $\ket{V}$, these states are eigenstates
of the Pauli operators defined as
\begin{align}
\sigma_x=\left(\begin{array}{cc}0 & 1\\ 1 & 0\\ \end{array}\right),
\sigma_y=\left(\begin{array}{cc}0 & -i\\ i & 0\\ \end{array}\right)
,\sigma_z=\left(\begin{array}{cc}1 & 0\\ 0 & -1\\ \end{array}\right).
\end{align}
$\ket{H}$ and $\ket{V}$ are the $+1$ and $-1$ eigenstates of
$\sigma_z$.  $\ket{D}$ and $\ket{A}$ are the $+1$ and $-1$
eigenstates of $\sigma_x$. $\ket{L}$ and $\ket{R}$ are the $+1$
and $-1$ eigenstates of $\sigma_y$.  
\subsection{The Bloch/Poincar\'e sphere}
Since the state of a qubit must be normalized, a convenient way
of writing it is 
\begin{equation}
\cos \theta \ket{H}+e^{i\phi}\sin\theta \ket{V}
\end{equation}
This parameterization of the qubit in terms of angles $\theta$
and $\phi$ is suggestive.  If we make the mapping
\begin{align}
x&=\sin 2\theta \cos \phi\\
y&=\sin 2\theta \sin \phi\\
z&=\cos 2\theta 
\end{align}
then by varying $\theta$ on $\left[0,\pi/2\right]$ and $\phi$ on
$\left[0,2\pi \right]$ this fully parameterizes a unit
sphere called the Bloch sphere.  This sphere provides a convenient visual representation
of the qubit.  Points that are antipodal on the Bloch sphere represent
orthogonal states of the qubit.  Overlaps between states can be
calculated solely from the relative angle between the two
corresponding points on the Bloch sphere.    
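
To make the last statement concrete, one can write the overlap of two pure
states in terms of the unit vectors ${\bf r}_1$ and ${\bf r}_2$ pointing
from the centre of the sphere to the corresponding points, and the angle
$\gamma$ between them (these symbols are introduced here only for
illustration):
\begin{equation*}
\left|\left<\psi_1|\psi_2\right>\right|^2=\frac{1}{2}\left(1+{\bf r}_1\cdot{\bf r}_2\right)=\cos^2\frac{\gamma}{2}.
\end{equation*}
Antipodal points ($\gamma=\pi$) give zero overlap, while $\ket{H}$ and
$\ket{D}$, which are separated by $\gamma=\pi/2$, have an overlap of $1/2$.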

Mixed states can also be represented in this
description; they are points on the interior of the Bloch sphere.
Let $\rho$ be a single-qubit density matrix 
\begin{equation}
\rho=\frac{1}{2}\left(
\begin{array}{cc} 
1+z & x-iy\\ 
x+iy & 1-z
\end{array}\right)
\end{equation} 
It follows from the positivity constraint on density matrices
that $\text{det} \rho=\frac{1}{4}\left(1-|{\bf r}|^2\right)\geq
0$ where ${\bf r}$ is the real-space vector $\left(x, y, z\right)$.  Any
point satisfying this inequality, namely points on the surface
and interior of the Bloch sphere, represents a valid qubit density
matrix.  
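
The components of ${\bf r}$ can be read as expectation values of the
Pauli operators, which suggests how a single-qubit density matrix might be
reconstructed from data.  As a sketch only, let $P_H$, $P_V$, $P_D$,
$P_A$, $P_L$ and $P_R$ denote the idealized, noise-free probabilities of
projecting onto each polarization state (a notation assumed here for
illustration).  Then
\begin{align*}
\rho&=\frac{1}{2}\left(\mathbb{I}+x\sigma_x+y\sigma_y+z\sigma_z\right),\\
x&=\text{Tr}\left[\rho\sigma_x\right]=P_D-P_A,\\
y&=\text{Tr}\left[\rho\sigma_y\right]=P_L-P_R,\\
z&=\text{Tr}\left[\rho\sigma_z\right]=P_H-P_V.
\end{align*}
For example, $P_H=1$ and $P_D=P_L=1/2$ give ${\bf r}=\left(0,0,1\right)$,
which is the pure state $\rho=\ket{H}\bra{H}$.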

Before there was the Bloch sphere there was the
Poincar\'e sphere.  Invented in 1891 by Henri Poincar\'e, the
Poincar\'e sphere represents classical polarizations in exactly
the same way that the Bloch sphere represents qubits.  There is
a difference of convention between the two descriptions.  The
north pole of the Poincar\'e sphere represents left-circular
polarization (i.e. the $\sigma_y$ $+1$ eigenstate) whereas the north
pole of the Bloch sphere represents $\ket{H}$ (i.e. the
$\sigma_z$ $+1$ eigenstate).  The Poincar\'e sphere has the
convenient feature that all linear polarization states
(i.e. those with a real-valued coherence between $\ket{H}$ and
$\ket{V}$) are located in the equatorial plane.  In this
dissertation we will primarily make use of the Poincar\'e sphere
description of polarization states since it is more natural for polarization.  To change to the
Bloch sphere picture the reader need only rotate his head
$90^\circ$ to the right.

\subsection{Qubit transformations}
The class of transformations that can be applied to a qubit
without changing its purity form a representation of the group
$SU(2)$.  They are most easily pictured as being rotations on
the Bloch/Poincar\'e sphere.  For this reason (and for brevity),
we often speak of polarization rotations as including the full
range of SU(2) transformations, not just rotations of linear polarizations.

For polarization, any SU(2) transformation can be made by making a
series of different phase delays about different axes.   A phase
delay of $\phi$ about an axis at angle $\theta$ rotates a point
on the Poincar\'e sphere through the angle $\phi$ about an axis on
the equatorial plane making an angle of $2\theta$ with the H/V
axis.  One popular method of making arbitrary polarization
transformations uses waveplates.  These are thin slices of
birefringent material meant to impart a fixed phase delay $\phi$ at a
variable angle $\theta$.  Typically, half waveplates with
$\phi=\pi$ and quarter waveplates with $\phi=\pi/2$ are used,
but one occasionally encounters plates with $\phi=2\pi$ and
$\phi=\pi/4$.  
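
In the Jones-vector language these statements can be made explicit.  The
following is a sketch under an assumed convention: the phase delay is
applied to the component orthogonal to the waveplate axis and global
phases are dropped.  Other sign conventions appear in the literature.
\begin{align*}
U(\phi,\theta)&=
\left(\begin{array}{cc}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\\ \end{array}\right)
\left(\begin{array}{cc}1 & 0\\ 0 & e^{i\phi}\\ \end{array}\right)
\left(\begin{array}{cc}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\\ \end{array}\right),\\
U(\pi,\theta)&=
\left(\begin{array}{cc}\cos 2\theta & \sin 2\theta\\ \sin 2\theta & -\cos 2\theta\\ \end{array}\right).
\end{align*}
A half waveplate at $\theta=22.5^\circ$ therefore maps $\ket{H}$ onto
$\ket{D}$, which on the Poincar\'e sphere is a rotation by $\pi$ about the
equatorial axis lying $45^\circ$ from the H/V axis.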

A half waveplate can take any linear polarization to a different
linear polarization since a rotation of a ray on the equatorial
plane about another ray on the equatorial plane by an angle
$\pi$ will result in a ray on the equatorial plane.  Quarter
waveplates rotate by $\pi/2$ and so can take a linear
polarization into an elliptical polarization anywhere on the
hemisphere of states whose long axis is polarized in the same direction as
the initial linear polarization.

A quarter waveplate and half-waveplate together can take a linear
polarization to any point on the Poincar\'e sphere.  A
quarter waveplate, half waveplate and quarter waveplate can perform
an arbitrary rotation that takes any point on the Poincar\'e
sphere to any other point on the Poincar\'e sphere.  

While this is useful in principle, it can be difficult to enact in practice
since the parameters under control, the value of $\theta$
for each waveplate, are tightly coupled, making the system very
hard to fine-tune except for some very specific sets of angles.
A much more experimentally convenient polarization controller is
one that allows $\theta$ to be fixed and a variable $\phi$ to be
applied.  This is exactly the situation with liquid crystal
variable waveplates (LCWPs), which will be discussed in detail in the next
chapter.

\subsection{Multi-qubit states}
The Hilbert space of two qubits is spanned by the tensor products of the
basis states of the individual qubits, namely $\ket{HH},
\ket{HV}, \ket{VH}, \ket{VV}$.  The density matrix has $4\times
4=16$ elements.  While unitary transformations can be applied to
the two qubits individually there is also a class of
transformations called \emph{entangling operations} that cannot be
separated into actions on the individual qubits.  

\subsubsection{Entanglement}
\begin{quote}
If two separated bodies, about which, individually, we have
maximal knowledge, come into a situation in which they influence
one another and then again separate themselves, then there
regularly arises that which I just called \emph{entanglement} of
our knowledge of the two bodies.  At the outset, the joint
catalogue of expectations consists of a logical sum of the
individual catalogues; during the process the joint catalogue
develops necessarily according to the known law\ldots Our
knowledge remains maximal, but at the end, if the bodies have
again separated themselves, that knowledge does not again
decompose into a logical sum of the knowledge of the individual bodies\cite{Schrodinger1935}.\\
--Erwin Schr\"odinger
\end{quote}

Entanglement is viewed by many physicists as the strangest
aspect of quantum mechanics.  It arises when the density matrix of a system of two or more
particles is non-separable.  That is to say that the system
cannot be described by specifying the properties of the
individual particles within it.  In a sense, the information
about the state is contained not in the individual particles' properties,
but in the correlations between those properties.  Entanglement is one of the main properties
that makes quantum information different from classical information.

For example, a state like the following
two-photon polarization state
\begin{align}
\ket{\phi^+}&=\frac{1}{\sqrt{2}}\left(\ket{HH}+\ket{VV}\right)\\
&=\frac{1}{\sqrt{2}}\left(\ket{DD}+\ket{AA}\right)\\
&=\frac{1}{\sqrt{2}}\left(\ket{RL}+\ket{LR}\right)
\end{align}
is entangled.  Each photon individually has an
equal probability of being horizontally polarized or vertically
polarized, left or right circularly polarized or diagonally or
anti-diagonally polarized.  In classical polarization theory a
beam of light having this property would be considered
unpolarized.  The quantum state, though, also has correlations.
The two photons in the state will have the same polarization
when measured in the horizontal/vertical and
diagonal/anti-diagonal bases and will have opposite
polarizations in the left and right circular bases.  It was
shown by John Bell in 1964\cite{Bell1964} that
the complete randomness of individual particle measurements coupled
with the perfect correlations between the single-particle
measurements in all bases is inconsistent with classical probability theory
and local realism.  

$\ket{\phi^+}$ is one of four maximally entangled states called Bell
states that form a basis for two-photon polarization states.
Reference will often be made to these states in this thesis, so we list them here
(the equalities between different bases hold up to a global phase):
\begin{align}
\ket{\phi^+}
&=\frac{1}{\sqrt{2}}\left(\ket{HH}+\ket{VV}\right)
=\frac{1}{\sqrt{2}}\left(\ket{DD}+\ket{AA}\right)
=\frac{1}{\sqrt{2}}\left(\ket{RL}+\ket{LR}\right)\\
\ket{\phi^-}
&=\frac{1}{\sqrt{2}}\left(\ket{HH}-\ket{VV}\right)
=\frac{1}{\sqrt{2}}\left(\ket{DA}+\ket{AD}\right)
=\frac{1}{\sqrt{2}}\left(\ket{RR}+\ket{LL}\right)\\
\ket{\psi^+}
&=\frac{1}{\sqrt{2}}\left(\ket{HV}+\ket{VH}\right)
=\frac{1}{\sqrt{2}}\left(\ket{DD}-\ket{AA}\right)
=\frac{1}{\sqrt{2}}\left(\ket{RR}-\ket{LL}\right)\\
\ket{\psi^-}
&=\frac{1}{\sqrt{2}}\left(\ket{HV}-\ket{VH}\right)
=\frac{1}{\sqrt{2}}\left(\ket{DA}-\ket{AD}\right)
=\frac{1}{\sqrt{2}}\left(\ket{RL}-\ket{LR}\right).
\end{align}
$\ket{\psi^-}$ is sometimes called the singlet state and
$\ket{\psi^+}$ the triplet state.  

From a mathematical point of view, entanglement is an expression
of the impossibility of factoring the density matrix on the full
Hilbert space of the joint system into a product of density
operators on the subsystems.  

While the problem of characterizing entanglement for states of
more than two particles is complex, quantification of the entanglement of
bipartite systems is well-understood.  If the state of the whole system is
pure, then one may characterize the degree of entanglement by
performing a partial trace over one of the particles and
measuring the von Neumann entropy of the density matrix for the
remaining subsystem\cite{Bennett1996_2}.  The von Neumann entropy is
defined as 
\begin{equation}
S(\rho)=-\text{Tr}\rho \log \rho
\end{equation}  
It is the natural extension of the Shannon entropy to quantum states. 
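
As a worked example of this procedure, take the Bell state
$\ket{\phi^+}=\frac{1}{\sqrt{2}}\left(\ket{HH}+\ket{VV}\right)$ and
logarithms to base 2, so that the entropy is measured in bits (the base is
an assumption made here, since it is not fixed by the definition above).
Tracing out the second photon gives
\begin{align*}
\rho_1&=\text{Tr}_2\left[\ket{\phi^+}\bra{\phi^+}\right]=\frac{1}{2}\left(\ket{H}\bra{H}+\ket{V}\bra{V}\right),\\
S(\rho_1)&=-2\times\frac{1}{2}\log_2\frac{1}{2}=1.
\end{align*}
By contrast, the product state $\ket{HH}$ gives $\rho_1=\ket{H}\bra{H}$ and
$S(\rho_1)=0$.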

For a density matrix $\rho=\sum p_i \ket{\psi_i}\bra{\psi_i}$
the natural definition of entanglement is the
average value of the entanglement of the $\ket{\psi_i}$ weighted
by the probabilities $p_i$.  Unfortunately, the decomposition of
$\rho$ into $\ket{\psi_i}\bra{\psi_i}$ is
not unique, and the choice of decomposition will have an
effect on the average value of entanglement.  The most sensible
definition of entanglement is therefore the minimum value of
the average entanglement over all such decompositions.  As Wootters
and Hill have shown\cite{Hill1997}, this minimum is a simple
function of the \emph{concurrence}.  The concurrence can be
obtained by first calculating the matrix
$\mathbf{R}=\rho\mathbf{\Sigma}\rho^T\mathbf{\Sigma}$\cite{James2001} where
\begin{equation}
\mathbf{\Sigma}=\sigma_y\otimes\sigma_y=\left(
\begin{array}{cccc}
0 & 0 & 0 & -1\\
0 & 0 & 1 & 0\\
0 & 1 & 0 & 0\\
-1 & 0 & 0 & 0
\end{array}
\right)
\end{equation}

The concurrence $\mathcal{C}(\rho)$ can then be obtained from the ordered eigenvalues of $\mathbf{R}$,
$\lambda_1\geq\lambda_2\geq\lambda_3\geq\lambda_4$ as 
\begin{align}
\mathcal{C}(\rho)\equiv\max\left(0,\sqrt{\lambda_1}-\sqrt{\lambda_2}-\sqrt{\lambda_3}-\sqrt{\lambda_4}\right).
\end{align}

The entanglement of $\rho$ is usually called the \emph{entanglement of
formation} and it can be expressed as
\begin{equation}
E(\rho)=H\left(\frac{1}{2}+\frac{1}{2}\sqrt{1-\mathcal{C}(\rho)^2}\right)
\end{equation} 
where $H(x)$ is the binary entropy function $H(x)=-\left[x
  \log_2 x+(1-x)\log_2(1-x)\right]$.
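
As a quick check of these formulas, two limiting cases can be worked by
hand, with the basis ordered as $\ket{HH}, \ket{HV}, \ket{VH}, \ket{VV}$
and the usual convention $0\log_2 0=0$.  For
$\rho=\ket{\phi^+}\bra{\phi^+}$ the density matrix is real and
$\mathbf{\Sigma}\ket{\phi^+}=-\ket{\phi^+}$, so that
\begin{align*}
\mathbf{R}&=\rho\mathbf{\Sigma}\rho^T\mathbf{\Sigma}=\rho^2=\rho, &
\left(\lambda_1,\lambda_2,\lambda_3,\lambda_4\right)&=\left(1,0,0,0\right), &
\mathcal{C}(\rho)&=1, &
E(\rho)&=H\left(\frac{1}{2}\right)=1.
\end{align*}
For the product state $\rho=\ket{HH}\bra{HH}$ one finds
$\mathbf{\Sigma}\ket{HH}=-\ket{VV}$, hence
$\mathbf{R}=\ket{HH}\left<HH|VV\right>\bra{VV}=0$, $\mathcal{C}(\rho)=0$ and
$E(\rho)=H(1)=0$.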
\subsection{The Bell/CHSH inequalities}
Following John Bell's derivation of inequalities violated by 
quantum mechanics and inconsistent with local realism\cite{Bell1964}, Clauser, Horne,
Shimony, and Holt produced a simpler version of them\cite{Clauser1969} suitable
for testing in an optics experiment\cite{Freedman1972}.   We
imagine two photons being distributed to two parties, one of
whom measures the counts at two output ports of a polarizing
beamsplitter (PBS) at angle $\alpha$ and the other at the two output
ports of a PBS at angle $\beta$.  We define
the joint polarization correlation visibility as
\begin{equation}
V\left(\alpha,\beta\right)=\frac{N_{++}+N_{--}-N_{+-}-N_{-+}}{N_{++}+N_{--}+N_{+-}+N_{-+}}
\end{equation} 
where we have labeled the output ports of the PBSs as $+$ and
$-$ and $N_{+-}$, for instance, is the number of times that the
first party sees a photon at the $+$ port and the second
party sees a photon at the $-$ port.

We construct the CHSH $S$-function as 
\begin{equation}
S(\alpha_1,\beta_1,\alpha_2,\beta_2)=V(\alpha_1,\beta_1)-V(\alpha_1,\beta_2)+V(\alpha_2,\beta_1)+V(\alpha_2,\beta_2)
\end{equation}
It can be shown that no local, realistic hidden-variable theory
will predict a value of $|S|>2$, but for
$\alpha_1=0^\circ$, $\alpha_2=45^\circ$,
$\beta_1=22.5^\circ$, and $\beta_2=67.5^\circ$, quantum
mechanics predicts a value of $S=2\sqrt{2}$ when the input state
is $\ket{\phi^+}$.
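
This value can be checked directly.  Assume, as a convention not spelled
out above, that the $+$ port of a PBS at angle $\alpha$ projects onto
$\cos\alpha\ket{H}+\sin\alpha\ket{V}$ and the $-$ port onto the orthogonal
linear polarization.  For the input state $\ket{\phi^+}$ the joint
probabilities and the visibility are then
\begin{align*}
P_{++}&=P_{--}=\frac{1}{2}\cos^2\left(\alpha-\beta\right), &
P_{+-}&=P_{-+}=\frac{1}{2}\sin^2\left(\alpha-\beta\right), &
V\left(\alpha,\beta\right)&=\cos 2\left(\alpha-\beta\right).
\end{align*}
Inserting the four angles quoted above gives
\begin{equation*}
S=\cos\left(-45^\circ\right)-\cos\left(-135^\circ\right)+\cos\left(45^\circ\right)+\cos\left(-45^\circ\right)
=4\times\frac{1}{\sqrt{2}}=2\sqrt{2}.
\end{equation*}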

The quantity $S$ and the number of standard deviations by which
it exceeds the classical limit of $2$ is sometimes used as a
measure of the quality of a source of entangled
photons\cite{Kwiat1999}.  Violation of some form of Bell's
inequalities is sufficient to demonstrate entanglement, but not
all entangled states are capable of violating Bell's
inequalities.  Consequently, the violation of Bell's
inequalities is a more stringent test of entanglement than simply
demonstrating a value of concurrence greater than zero. 

\subsection{Quantum measurement}
Traditionally, measurements in quantum mechanics have been
described by projectors.  These are orthogonal,
idempotent, Hermitian operators.  A projection-valued measure or
PVM is a set of projectors that sum to unity,  
\begin{equation}
\sum_i \hat{P}_i=\mathbb{I}
\end{equation}
We call the individual projectors in a PVM elements of the PVM.
Sometimes we will use the term \emph{basis} interchangeably with PVM since by
virtue of their orthogonality the PVM elements, when of rank one,
form a basis for the projective Hilbert space.
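
For a single polarization qubit the simplest example is the PVM with
elements $\hat{P}_H=\ket{H}\bra{H}$ and $\hat{P}_V=\ket{V}\bra{V}$, which is
what an ideal polarizing beamsplitter followed by two detectors would
implement (an idealization assumed here for illustration):
\begin{align*}
\hat{P}_H+\hat{P}_V&=\mathbb{I}, &
\hat{P}_H\hat{P}_V&=0, &
p_H&=\text{Tr}\left[\rho\hat{P}_H\right]=\rho_{HH}, &
p_V&=\text{Tr}\left[\rho\hat{P}_V\right]=\rho_{VV}.
\end{align*}
The outcome probabilities follow from
Eq.~\ref{eq:expectation_values} and sum to one because $\text{Tr}\left[\rho\right]=1$.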

PVMs are a special
case of a larger class of measurements called positive-operator valued measures
(POVMs)\cite{MikeandIke} which have been of great importance in
theoretical quantum information theory\cite{Fuchs2002}.  POVMs can be implemented by first
interacting the system under study with a larger quantum system
and then doing a projective measurement on the larger system.
For a variety of reasons, many argue that POVMs represent a more fundamental measurement primitive than do
PVMs\cite{Fuchs2002}.  That said, for experimental simplicity, for the experiments described in this thesis we will
restrict ourselves
to PVMs or, occasionally, to measurement operators that are
convex sums over PVMs.  

\subsection{Two-photon interference}
When two photons are incident on the input ports of a
non-polarizing beamsplitter, an interesting interference effect
called Hong-Ou-Mandel (HOM) interference
takes place\cite{Hong1987}.  If we label the input modes of the
beamsplitter $1$ and $2$ and the output modes $3$ and $4$, the
action of the beamsplitter is to map modes $1$ and $2$ onto
modes $3$ and $4$ as follows:
\begin{align}
a^\dagger_1 &\rightarrow \frac{1}{\sqrt{2}}\left(a^\dagger_3+a^\dagger_4\right)\\
a^\dagger_2 &\rightarrow \frac{1}{\sqrt{2}}\left(a^\dagger_3-a^\dagger_4\right)
\end{align}
If we put one photon in mode $1$ and another in mode $2$ at the
same moment in time, then the beamsplitter will map the two
photon state $a^\dagger_1 a^\dagger_2$ onto
\begin{align*}
a^\dagger_1 a^\dagger_2 &\rightarrow
\frac{1}{2}\left(a^\dagger_3+a^\dagger_4\right)\left(a^\dagger_3-a^\dagger_4\right)\\
&=\frac{1}{2}\left(a^\dagger_3 a^\dagger_3 - a^\dagger_3
a^\dagger_4 + a^\dagger_4 a^\dagger_3 - a^\dagger_4
a^\dagger_4\right)\\
&=\frac{1}{2}\left(a^\dagger_3 a^\dagger_3 -a^\dagger_4 a^\dagger_4\right)
\end{align*}
Through destructive interference, the probability amplitude for
a photon to be in mode $3$ and another in mode $4$ has
disappeared.  This interference effect cannot be thought of
classically in terms of interference of the electromagnetic
field, but it does have an interpretation in terms of Feynman
paths.  Since the events where both photons are
reflected at the beamsplitter and the events where both photons
are transmitted have indistinguishable outcomes, we expect them
to interfere.  That the interference is
destructive is a consequence of the quantum statistical nature
of photons as bosons.  Were the same experiment to be done with
fermions, the particles would always leave by opposite ports and
never by the same port.  
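
Both statements can be made explicit with a short sketch.  It assumes the
standard Fock-state convention
$\left(a^\dagger\right)^2\ket{vac}=\sqrt{2}\ket{2}$ and, for the fermionic
case, hypothetical creation operators $c^\dagger_i$ that obey the same
beamsplitter map but anticommute; neither is notation taken from the text
above.  For photons the output state is the normalized superposition
\begin{align*}
\frac{1}{2}\left(a^\dagger_3 a^\dagger_3 - a^\dagger_4 a^\dagger_4\right)\ket{vac}
=\frac{1}{\sqrt{2}}\left(\ket{2_3,0_4}-\ket{0_3,2_4}\right),
\end{align*}
so both photons are found in mode $3$ or both in mode $4$, each with
probability $1/2$.  For fermions, $c^\dagger_3 c^\dagger_3=c^\dagger_4 c^\dagger_4=0$ and
$c^\dagger_4 c^\dagger_3=-c^\dagger_3 c^\dagger_4$, so that
\begin{align*}
c^\dagger_1 c^\dagger_2 &\rightarrow
\frac{1}{2}\left(c^\dagger_3 c^\dagger_3 - c^\dagger_3
c^\dagger_4 + c^\dagger_4 c^\dagger_3 - c^\dagger_4
c^\dagger_4\right)
=-c^\dagger_3 c^\dagger_4 .
\end{align*}
The cross terms that cancelled for bosons now add, and one particle is
found in each output mode with certainty.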

While Hong-Ou-Mandel interference is a simple consequence of linear
quantum field theory, it appears very much like an interaction.
Two photons meet and seem to `stick together' after meeting.
Indeed, it can be shown that this effect can be used to mediate
`effective interactions' that can provide the basis for a
quantum computing scheme called \emph{linear optics quantum
  computing}\cite{KLM2001}.

Two-photon interference is used in this thesis in a closely
related way.  We use it to make projective measurements onto
Bell states\cite{Weinfurter1994}.  While this method of
performing Bell-state measurements works, it has been proven
that it is impossible to perform a complete Bell-state PVM
deterministically using only
linear optics\cite{Lutkenhaus1999}.  For many quantum
information applications,
though, even a partial Bell-state PVM capable of distinguishing only
one or two of the Bell states is still a very useful experimental tool.

\subsection{Bell state filtering}
Quantum mechanics allows for a multiparticle system to be
measured in an entangled basis.  This is called an entangling
measurement or a Bell-state measurement (BSM).  BSMs were crucial to
demonstrating the quantum teleportation protocol\cite{Bouwmeester1997} and other
quantum protocols\cite{Buhrman2001}.  They have the
interesting property that they determine correlations between 
single-particle states without collapsing those states
individually.  This means that with entangling measurements, correlations can be
probed in multiple bases at once.  The work in Chapter 4 and
Chapter 5 relies on entangling measurements to improve how
information is extracted from quantum systems. 

In the previous subsection we implicitly assumed that the two input
photons had the same polarization.  Let us now assume that the
input photons are in some arbitrary quantum polarization state
$\ket{\psi}=\left(\alpha a_{H1}^\dagger
a_{H2}^\dagger+ \beta a_{H1}^\dagger a_{V2}^\dagger+ \gamma a_{V1}^\dagger
a_{H2}^\dagger+ \delta a_{V1}^\dagger
a_{V2}^\dagger\right)\ket{vac}$ where $\alpha$, $\beta$,
$\gamma$ and $\delta$ are arbitrary complex constants selected so that
the state is normalized. 

Two-photon interference will map the spatial modes in
the same way as before to produce an output state
\begin{align*}
\ket{\psi}&\rightarrow\frac{1}{2}\left[\right.\alpha \left(a_{H3}^\dagger a_{H3}^\dagger - a_{H4}^\dagger
  a_{H4}^\dagger\right)+ \beta \left(a_{H3}^\dagger
  a_{V3}^\dagger + a_{H4}^\dagger a_{V3}^\dagger - a_{H3}^\dagger
a_{V4}^\dagger - a_{H4}^\dagger a_{V4}^\dagger\right)+\\
&\gamma \left(a_{V3}^\dagger
  a_{H3}^\dagger + a_{V4}^\dagger a_{H3}^\dagger - a_{V3}^\dagger
a_{H4}^\dagger - a_{V4}^\dagger a_{H4}^\dagger \right)+ \delta \left(a_{V3}^\dagger a_{V3}^\dagger - a_{V4}^\dagger
  a_{V4}^\dagger\right)\left.\right]\ket{vac}\\
&=\frac{1}{2}\left(\right.\alpha \left[a_{H3}^\dagger a_{H3}^\dagger - a_{H4}^\dagger
  a_{H4}^\dagger\right]
+\left(\beta+\gamma\right)\left[a_{H3}^\dagger
  a_{V3}^\dagger-a_{H4}^\dagger a_{V4}^\dagger\right]\\
&+\left(\beta-\gamma\right)\left[a_{H4}^\dagger
  a_{V3}^\dagger-a_{H3}^\dagger a_{V4}^\dagger\right]
+\delta \left[a_{V3}^\dagger
  a_{V3}^\dagger-a_{V4}^\dagger a_{V4}^\dagger\right]\left.\right)\ket{vac}
\end{align*}
We note that the only term that will give rise to a coincidence
detection (i.e. that has photons in both modes $3$ and $4$) is
the term with $\left(\beta-\gamma\right)$ as a coefficient.  With
the usual normalization of the Bell states, it is easy to show that
$\left(\beta-\gamma\right)=\sqrt{2}\braket{\psi^-|\psi}$ so that the
only component of the state that gives rise to a coincidence
count is the one lying along the maximally entangled state
$\ket{\psi^-}$.  For this reason, two-photon interference
followed by coincidence detection can be viewed as a
$\ket{\psi^-}$ filter or singlet state filter. 
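
As a check on this identification, the coincidence probability can be worked
out explicitly.  This is a sketch that assumes the usual normalization
$\ket{\psi^\pm}=\frac{1}{\sqrt{2}}\left(a_{H1}^\dagger a_{V2}^\dagger\pm
a_{V1}^\dagger a_{H2}^\dagger\right)\ket{vac}$, ideal detectors and perfect
interference, and that keeps the overall factor of $\frac{1}{2}$ contributed
by the two beamsplitter transformations.  The label $P_{\mathrm{coinc}}$ is
introduced here only for illustration.
\begin{align*}
\braket{\psi^-|\psi}&=\frac{1}{\sqrt{2}}\left(\beta-\gamma\right),\\
P_{\mathrm{coinc}}&=\left|\frac{\beta-\gamma}{2}\right|^2\left(1+1\right)
=\frac{\left|\beta-\gamma\right|^2}{2}
=\left|\braket{\psi^-|\psi}\right|^2 .
\end{align*}
The two equal contributions come from the orthonormal states
$a_{H4}^\dagger a_{V3}^\dagger\ket{vac}$ and
$a_{H3}^\dagger a_{V4}^\dagger\ket{vac}$.  Under these assumptions the
coincidence rate is a direct measure of the singlet fraction of the input
state.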

Similarly, among the components of the state that do not give
rise to coincidences in modes $3$ and $4$, only the term with coefficient
$\left(\beta+\gamma\right)$ contains both a horizontal and a
vertical photon.  With the usual normalization of the Bell states, it can be shown that
$\left(\beta+\gamma\right)=\sqrt{2}\braket{\psi^+|\psi}$.   Thus if one looks at only one output port of
the beamsplitter, and detects events when one horizontal and one
vertical photon leave from that port, then one will have measured
the projection onto $\ket{\psi^+}$, one of the triplet states. 
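
The probabilities of the remaining outcomes can be tallied in the same way.
The following is a sketch under stated assumptions: the Bell states are
normalized as $\ket{\psi^\pm}=\frac{1}{\sqrt{2}}\left(a_{H1}^\dagger
a_{V2}^\dagger\pm a_{V1}^\dagger a_{H2}^\dagger\right)\ket{vac}$, the output
state carries the overall factor of $\frac{1}{2}$ from the two beamsplitter
transformations, $\left(a^\dagger\right)^2\ket{vac}=\sqrt{2}\ket{2}$, and the
labels $P_{HV}$, $P_{HH}$ and $P_{VV}$ are introduced only for illustration.
\begin{align*}
P_{HV}&=\left|\frac{\beta+\gamma}{2}\right|^2\left(1+1\right)
=\frac{\left|\beta+\gamma\right|^2}{2}
=\left|\braket{\psi^+|\psi}\right|^2,\\
P_{HH}&=\left|\frac{\alpha}{2}\right|^2\left(2+2\right)=\left|\alpha\right|^2,\qquad
P_{VV}=\left|\frac{\delta}{2}\right|^2\left(2+2\right)=\left|\delta\right|^2 .
\end{align*}
Here $P_{HV}$ is the probability that one horizontal and one vertical photon
leave by the same port.  It is shared equally between ports $3$ and $4$, so
monitoring a single port registers the $\ket{\psi^+}$ component with
probability $\left|\beta+\gamma\right|^2/4$.  Together with the coincidence
probability $\left|\beta-\gamma\right|^2/2$ carried by the
$\left(\beta-\gamma\right)$ term, the probabilities sum to
$\left|\alpha\right|^2+\left|\beta\right|^2+\left|\gamma\right|^2+\left|\delta\right|^2=1$,
as they must for a normalized input state.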

In reality, the interference visibility of two-photon
interference is not perfect and so one has to slightly modify
this theory to take account of real experimental conditions. 
These modifications will be discussed when we look at the
experimental use of two-photon interference in Chapters 4
and 5.

\section{Summary}
We have looked at a number of concepts that will be
important in understanding the work presented in this thesis.
In the next chapter we will go deeply into the theory and
practice of quantum state estimation and explain the
experimental techniques used to prepare, manipulate and measure
quantum states.  
